Goto

Collaborating Authors

 Mediterranean Sea


Ancient origin of an urban underground mosquito Science

Science

Understanding how life is adapting to urban environments represents an important challenge in evolutionary biology. In this work, we investigate a widely cited example of urban adaptation, Culex pi...


Localized Forest Fire Risk Prediction: A Department-Aware Approach for Operational Decision Support

Caron, Nicolas, Guyeux, Christophe, Noura, Hassan, Aynes, Benjamin

arXiv.org Artificial Intelligence

Forest fire prediction involves estimating the likelihood of fire ignition or related risk levels in a specific area over a defined time period. With climate change intensifying fire behavior and frequency, accurate prediction has become one of the most pressing challenges in Artificial Intelligence (AI). Traditionally, fire ignition is approached as a binary classification task in the literature. However, this formulation oversimplifies the problem, especially from the perspective of end-users such as firefighters. In general, as is the case in France, firefighting units are organized by department, each with its terrain, climate conditions, and historical experience with fire events. Consequently, fire risk should be modeled in a way that is sensitive to local conditions and does not assume uniform risk across all regions. This paper proposes a new approach that tailors fire risk assessment to departmental contexts, offering more actionable and region-specific predictions for operational use. With this, we present the first national-scale AI benchmark for metropolitan France using state-of-the-art AI models on a relatively unexplored dataset. Finally, we offer a summary of important future works that should be taken into account. Supplementary materials are available on GitHub.


Machine Understanding of Scientific Language

Wright, Dustin

arXiv.org Artificial Intelligence

Scientific information expresses human understanding of nature. This knowledge is largely disseminated in different forms of text, including scientific papers, news articles, and discourse among people on social media. While important for accelerating our pursuit of knowledge, not all scientific text is faithful to the underlying science. As the volume of this text has burgeoned online in recent years, it has become a problem of societal importance to be able to identify the faithfulness of a given piece of scientific text automatically. This thesis is concerned with the cultivation of datasets, methods, and tools for machine understanding of scientific language, in order to analyze and understand science communication at scale. To arrive at this, I present several contributions in three areas of natural language processing and machine learning: automatic fact checking, learning with limited data, and scientific text processing. These contributions include new methods and resources for identifying check-worthy claims, adversarial claim generation, multi-source domain adaptation, learning from crowd-sourced labels, cite-worthiness detection, zero-shot scientific fact checking, detecting exaggerated scientific claims, and modeling degrees of information change in science communication. Critically, I demonstrate how the research outputs of this thesis are useful for effectively learning from limited amounts of scientific text in order to identify misinformative scientific statements and generate new insights into the science communication process


Leveraging Novel Ensemble Learning Techniques and Landsat Multispectral Data for Estimating Olive Yields in Tunisia

Kefi, Mohamed, Pham, Tien Dat, Nguyen, Thin, Tjoelker, Mark G., Devasirvatham, Viola, Kashiwagi, Kenichi

arXiv.org Artificial Intelligence

Olive production is an important tree crop in Mediterranean climates. However, olive yield varies significantly due to climate change. Accurately estimating yield using remote sensing and machine learning remains a complex challenge. In this study, we developed a streamlined pipeline for olive yield estimation in the Kairouan and Sousse governorates of Tunisia. We extracted features from multispectral reflectance bands, vegetation indices derived from Landsat-8 OLI and Landsat-9 OLI-2 satellite imagery, along with digital elevation model data. These spatial features were combined with ground-based field survey data to form a structured tabular dataset. We then developed an automated ensemble learning framework, implemented using AutoGluon to train and evaluate multiple machine learning models, select optimal combinations through stacking, and generate robust yield predictions using five-fold cross-validation. The results demonstrate strong predictive performance from both sensors, with Landsat-8 OLI achieving R2 = 0.8635 and RMSE = 1.17 tons per ha, and Landsat-9 OLI-2 achieving R2 = 0.8378 and RMSE = 1.32 tons per ha. This study highlights a scalable, cost-effective, and accurate method for olive yield estimation, with potential applicability across diverse agricultural regions globally.


SF$^2$Bench: Evaluating Data-Driven Models for Compound Flood Forecasting in South Florida

Zheng, Xu, Lin, Chaohao, Chen, Sipeng, Chen, Zhuomin, Shi, Jimeng, Cheng, Wei, Obeysekera, Jayantha, Liu, Jason, Luo, Dongsheng

arXiv.org Artificial Intelligence

Forecasting compound floods presents a significant challenge due to the intricate interplay of meteorological, hydrological, and oceanographic factors. Analyzing compound floods has become more critical as the global climate increases flood risks. Traditional physics-based methods, such as the Hydrologic Engineering Center's River Analysis System, are often time-inefficient. Machine learning has recently demonstrated promise in both modeling accuracy and computational efficiency. However, the scarcity of comprehensive datasets currently hinders systematic analysis. Existing water-related datasets are often limited by a sparse network of monitoring stations and incomplete coverage of relevant factors. To address this challenge, we introduce SF2Bench, a comprehensive time series collection on compound floods in South Florida, which integrates four key factors: tide, rainfall, groundwater, and human management activities (gate and pump controlling). This integration allows for a more detailed analysis of the individual contributions of these drivers to compound flooding and informs the development of improved flood forecasting approaches. To comprehensively evaluate the potential of various modeling paradigms, we assess the performance of six categories of methods, encompassing Multilayer Perceptrons, Convolutional Neural Networks, Recurrent Neural Networks, Graph Neural Networks, Transformers, and Large Language Models. We verified the impact of different key features on flood forecasting through experiments. Our analysis examines temporal and spatial aspects, providing insights into the influence of historical data and spatial dependencies. The varying performance across these approaches underscores the diverse capabilities of each in capturing complex temporal and spatial dependencies inherent in compound floods.


Wildfire spread forecasting with Deep Learning

Anastasiou, Nikolaos, Kondylatos, Spyros, Papoutsis, Ioannis

arXiv.org Artificial Intelligence

Accurate prediction of wildfire spread is crucial for effective risk management, emergency response, and strategic resource allocation. In this study, we present a deep learning (DL)-based framework for forecasting the final extent of burned areas, using data available at the time of ignition. We leverage a spatio-temporal dataset that covers the Mediterranean region from 2006 to 2022, incorporating remote sensing data, meteorological observations, vegetation maps, land cover classifications, anthropogenic factors, topography data, and thermal anomalies. To evaluate the influence of temporal context, we conduct an ablation study examining how the inclusion of pre- and post-ignition data affects model performance, benchmarking the temporal-aware DL models against a baseline trained exclusively on ignition-day inputs. Our results indicate that multi-day observational data substantially improve predictive accuracy. Particularly, the best-performing model, incorporating a temporal window of four days before to five days after ignition, improves both the F1 score and the Intersection over Union by almost 5% in comparison to the baseline on the test dataset. We publicly release our dataset and models to enhance research into data-driven approaches for wildfire modeling and response.


Regional climate projections using a deep-learning-based model-ranking and downscaling framework: Application to European climate zones

Loganathan, Parthiban, Zea, Elias, Vinuesa, Ricardo, Otero, Evelyn

arXiv.org Artificial Intelligence

Accurate regional climate forecast calls for high-resolution downscaling of Global Climate Models (GCMs). This work presents a deep-learning-based multi-model evaluation and downscaling framework ranking 32 Coupled Model Intercomparison Project Phase 6 (CMIP6) models using a Deep Learning-TOPSIS (DL-TOPSIS) mechanism and so refines outputs using advanced deep-learning models. Using nine performance criteria, five K\"oppen-Geiger climate zones -- Tropical, Arid, Temperate, Continental, and Polar -- are investigated over four seasons. While TaiESM1 and CMCC-CM2-SR5 show notable biases, ranking results show that NorESM2-LM, GISS-E2-1-G, and HadGEM3-GC31-LL outperform other models. Four models contribute to downscaling the top-ranked GCMs to 0.1$^{\circ}$ resolution: Vision Transformer (ViT), Geospatial Spatiotemporal Transformer with Attention and Imbalance-Aware Network (GeoSTANet), CNN-LSTM, and CNN-Long Short-Term Memory (ConvLSTM). Effectively capturing temperature extremes (TXx, TNn), GeoSTANet achieves the highest accuracy (Root Mean Square Error (RMSE) = 1.57$^{\circ}$C, Kling-Gupta Efficiency (KGE) = 0.89, Nash-Sutcliffe Efficiency (NSE) = 0.85, Correlation ($r$) = 0.92), so reducing RMSE by 20% over ConvLSTM. CNN-LSTM and ConvLSTM do well in Continental and Temperate zones; ViT finds fine-scale temperature fluctuations difficult. These results confirm that multi-criteria ranking improves GCM selection for regional climate studies and transformer-based downscaling exceeds conventional deep-learning methods. This framework offers a scalable method to enhance high-resolution climate projections, benefiting impact assessments and adaptation plans.


BESTAnP: Bi-Step Efficient and Statistically Optimal Estimator for Acoustic-n-Point Problem

Sheng, Wenliang, Zhao, Hongxu, Chen, Lingpeng, Zeng, Guangyang, Shao, Yunling, Hong, Yuze, Yang, Chao, Hong, Ziyang, Wu, Junfeng

arXiv.org Artificial Intelligence

We consider the acoustic-n-point (AnP) problem, which estimates the pose of a 2D forward-looking sonar (FLS) according to n 3D-2D point correspondences. We explore the nature of the measured partial spherical coordinates and reveal their inherent relationships to translation and orientation. Based on this, we propose a bi-step efficient and statistically optimal AnP (BESTAnP) algorithm that decouples the estimation of translation and orientation. Specifically, in the first step, the translation estimation is formulated as the range-based localization problem based on distance-only measurements. In the second step, the rotation is estimated via eigendecomposition based on azimuth-only measurements and the estimated translation. BESTAnP is the first AnP algorithm that gives a closed-form solution for the full six-degree pose. In addition, we conduct bias elimination for BESTAnP such that it owns the statistical property of consistency. Through simulation and real-world experiments, we demonstrate that compared with the state-of-the-art (SOTA) methods, BESTAnP is over ten times faster and features real-time capacity in resource-constrained platforms while exhibiting comparable accuracy. Moreover, for the first time, we embed BESTAnP into a sonar-based odometry which shows its effectiveness for trajectory estimation.


Shrub of a thousand faces: an individual segmentation from satellite images using deep learning

Khaldi, Rohaifa, Tabik, Siham, Puertas-Ruiz, Sergio, de Giles, Julio Peñas, Correa, José Antonio Hódar, Zamora, Regino, Segura, Domingo Alcaraz

arXiv.org Artificial Intelligence

Monitoring the distribution and size structure of long-living shrubs, such as Juniperus communis, can be used to estimate the long-term effects of climate change on high-mountain and high latitude ecosystems. Historical aerial very-high resolution imagery offers a retrospective tool to monitor shrub growth and distribution at high precision. Currently, deep learning models provide impressive results for detecting and delineating the contour of objects with defined shapes. However, adapting these models to detect natural objects that express complex growth patterns, such as junipers, is still a challenging task. This research presents a novel approach that leverages remotely sensed RGB imagery in conjunction with Mask R-CNN-based instance segmentation models to individually delineate Juniperus shrubs above the treeline in Sierra Nevada (Spain). In this study, we propose a new data construction design that consists in using photo interpreted (PI) and field work (FW) data to respectively develop and externally validate the model. We also propose a new shrub-tailored evaluation algorithm based on a new metric called Multiple Intersections over Ground Truth Area (MIoGTA) to assess and optimize the model shrub delineation performance. Finally, we deploy the developed model for the first time to generate a wall-to-wall map of Juniperus individuals. The experimental results demonstrate the efficiency of our dual data construction approach in overcoming the limitations associated with traditional field survey methods. They also highlight the robustness of MIoGTA metric in evaluating instance segmentation models on species with complex growth patterns showing more resilience against data annotation uncertainty. Furthermore, they show the effectiveness of employing Mask R-CNN with ResNet101-C4 backbone in delineating PI and FW shrubs, achieving an F1-score of 87,87% and 76.86%, respectively.


Spotting Virus from Satellites: Modeling the Circulation of West Nile Virus Through Graph Neural Networks

Bonicelli, Lorenzo, Porrello, Angelo, Vincenzi, Stefano, Ippoliti, Carla, Iapaolo, Federica, Conte, Annamaria, Calderara, Simone

arXiv.org Artificial Intelligence

The occurrence of West Nile Virus (WNV) represents one of the most common mosquito-borne zoonosis viral infections. Its circulation is usually associated with climatic and environmental conditions suitable for vector proliferation and virus replication. On top of that, several statistical models have been developed to shape and forecast WNV circulation: in particular, the recent massive availability of Earth Observation (EO) data, coupled with the continuous advances in the field of Artificial Intelligence, offer valuable opportunities. In this paper, we seek to predict WNV circulation by feeding Deep Neural Networks (DNNs) with satellite images, which have been extensively shown to hold environmental and climatic features. Notably, while previous approaches analyze each geographical site independently, we propose a spatial-aware approach that considers also the characteristics of close sites. Specifically, we build upon Graph Neural Networks (GNN) to aggregate features from neighbouring places, and further extend these modules to consider multiple relations, such as the difference in temperature and soil moisture between two sites, as well as the geographical distance. Moreover, we inject time-related information directly into the model to take into account the seasonality of virus spread. We design an experimental setting that combines satellite images - from Landsat and Sentinel missions - with ground truth observations of WNV circulation in Italy. We show that our proposed Multi-Adjacency Graph Attention Network (MAGAT) consistently leads to higher performance when paired with an appropriate pre-training stage. Finally, we assess the importance of each component of MAGAT in our ablation studies.